This report uses the pointblank package to validate the data.
In the resulting tables, any failing validation step should have a CSV
button that lets you download a .csv file of just the rows of data that
don’t pass that particular step.
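The rows behind each CSV button can also be retrieved in code. The following is a minimal sketch, not the report's own code: it assumes a toy data frame (`toy`, with hypothetical columns `plant_id` and `ht`) in place of the real datasets, and uses pointblank's `get_data_extracts()` to pull the rows that failed a given step.

```r
library(pointblank)

# Toy stand-in for the real data; column names are hypothetical
toy <- data.frame(
  plant_id = 1:5,
  ht       = c(12, 250, 30, -4, 88)
)

# One validation step: heights must lie between 0 and 200
agent <- create_agent(tbl = toy, label = "Extract demo") %>%
  col_vals_between(columns = vars(ht), left = 0, right = 200) %>%
  interrogate()

# Rows that failed step 1 (the rows a CSV button would export)
failed_rows <- get_data_extracts(agent, i = 1)
print(failed_rows)

# Optionally save them yourself (hypothetical file name):
# write.csv(failed_rows, "failed_step_1.csv", row.names = FALSE)
```

Extracts are only collected for steps with at least one failing row, so asking for a step that passed completely will not return any rows.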
Action levels
By default, warn if 1 or more rows fail conditions and error if 2% or more fail. Some checks are run with a stricter action level that errors if any rows fail.
al_default <- action_levels(warn_at = 1, stop_at = 0.02) # warn if even one row fails, error if 2% or more of rows fail
al_strict <- action_levels(stop_at = 1) # error if even one row fails
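To illustrate how the two action levels behave, here is a small sketch under stated assumptions: the data frame `toy` and its column `shts` are made up for the example, and the same range check is run twice so the default and strict levels can be compared side by side. Whole numbers of 1 or more in `action_levels()` are treated as row counts, and values below 1 as fractions of rows.

```r
library(pointblank)

al_default <- action_levels(warn_at = 1, stop_at = 0.02)
al_strict  <- action_levels(stop_at = 1)

# Toy data: 100 rows with exactly one out-of-range value (1% of rows)
toy <- data.frame(shts = c(rep(3, 99), 25))

agent <- create_agent(tbl = toy, actions = al_default) %>%
  # Step 1 inherits al_default from the agent
  col_vals_between(columns = vars(shts), left = 0, right = 20) %>%
  # Step 2 overrides it with the stricter level
  col_vals_between(columns = vars(shts), left = 0, right = 20,
                   actions = al_strict) %>%
  interrogate()

# Per-step results: failing row count and which conditions were triggered
x <- get_agent_x_list(agent)
print(data.frame(step = x$i, n_failed = x$n_failed,
                 warn = x$warn, stop = x$stop))
```

With one failing row out of 100, the default level should flag a warning but stay under the 2% stop threshold, while the strict level should flag a stop on the same data.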
The two datasets being submitted with the data paper are HDP_plots.csv and HDP_1997_2009.csv.
Checks for data type, range, and duplicates
Pointblank Validation: Data Validation (tibble; WARN 1, STOP 0.02, NOTIFY —)

| | STEP | COLUMNS | VALUES | TBL | EVAL | UNITS | PASS (fraction) | FAIL (fraction) | W | S | N | EXT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | col_vals_in_set() | | | | ✓ | 67K | 67K (1) | 0 (0) | — | ○ | — | — |
| 2 | col_vals_in_set() | | | | ✓ | 67K | 67K (1) | 0 (0) | — | ○ | — | — |
| 3 | Height is measured to nearest cm | | | | ✓ | 57K | 57K (1) | 0 (0) | — | ○ | — | — |
| 4 | Shoots is integer | | | | ✓ | 57K | 57K (1) | 0 (0) | — | ○ | — | — |
| 5 | Number of inflorescences is integer | | | | ✓ | 2K | 2K (1) | 0 (0) | — | ○ | — | — |
| 6 | shoots between 0 and 20 | | | | ✓ | 67K | 67K (1) | 8 (0) | ● | ○ | — | |
| 7 | height between 0 and 200 cm | | | | ✓ | 67K | 67K (1) | 2 (0) | ● | ○ | — | |
| 8 | inflorescences between 0 and 3 | | | | ✓ | 67K | 67K (1) | 15 (0) | ● | ○ | — | |
| 9 | duplicated rows | NA | | | ✓ | 67K | 67K (1) | 0 (0) | — | ○ | — | — |
| 10 | col_vals_not_null() | NA | | | ✓ | 67K | 67K (1) | 0 (0) | — | ○ | — | — |
| 11 | Check for duplicate IDs within each year | NA | | | ✓ | 3K | 3K (1) | 0 (0) | — | ○ | — | — |
| 12 | Check for duplicate IDs within each year | NA | | | ✓ | 4K | 4K (1) | 0 (0) | — | ○ | — | — |
| 13 | Check for duplicate IDs within each year | NA | | | ✓ | 5K | 5K (1) | 0 (0) | — | ○ | — | — |
| 14 | Check for duplicate IDs within each year | NA | | | ✓ | 6K | 6K (1) | 0 (0) | — | ○ | — | — |
| 15 | Check for duplicate IDs within each year | NA | | | ✓ | 6K | 6K (1) | 0 (0) | — | ○ | — | — |
| 16 | Check for duplicate IDs within each year | NA | | | ✓ | 6K | 6K (1) | 0 (0) | — | ○ | — | — |
| 17 | Check for duplicate IDs within each year | NA | | | ✓ | 6K | 6K (1) | 0 (0) | — | ○ | — | — |
| 18 | Check for duplicate IDs within each year | NA | | | ✓ | 6K | 6K (1) | 0 (0) | — | ○ | — | — |
| 19 | Check for duplicate IDs within each year | NA | | | ✓ | 7K | 7K (1) | 0 (0) | — | ○ | — | — |
| 20 | Check for duplicate IDs within each year | NA | | | ✓ | 5K | 5K (1) | 0 (0) | — | ○ | — | — |
| 21 | Check for duplicate IDs within each year | NA | | | ✓ | 6K | 6K (1) | 0 (0) | — | ○ | — | — |
| 22 | Check for duplicate IDs within each year | NA | | | ✓ | 6K | 6K (1) | 0 (0) | — | ○ | — | — |

Started 2022-12-06 20:03:03 UTC; ran 6.3 s; finished 2022-12-06 20:03:09 UTC
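The code behind these steps is not shown in the report, so the following is only a sketch of how checks of this kind could be written with pointblank. The data frame `toy`, its column names (`plot`, `year`, `id`, `shts`, `ht`), and the allowed plot values are all hypothetical. Using `segments = vars(year)` to get one duplicate check per year is an assumption about how the repeated "within each year" steps were produced.

```r
library(pointblank)

# Toy data; column names and allowed values are hypothetical stand-ins
toy <- data.frame(
  plot = c("A", "A", "B", "B", "B"),
  year = c(1998L, 1999L, 1998L, 1999L, 1999L),
  id   = c(1L, 1L, 2L, 2L, 2L),    # id 2 appears twice in 1999
  shts = c(2L, 3L, 1L, 25L, 4L),   # 25 is outside 0-20
  ht   = c(15, 20.5, 33, 41, 40),  # 20.5 is not a whole number of cm
  stringsAsFactors = FALSE
)

agent <- create_agent(tbl = toy, label = "Type, range, duplicates") %>%
  # Values must come from a known set
  col_vals_in_set(columns = vars(plot), set = c("A", "B")) %>%
  # Column type check (one test unit for the whole column)
  col_is_integer(columns = vars(shts)) %>%
  # Row-level check that heights are whole centimeters
  col_vals_expr(expr = ~ ht == round(ht),
                label = "Height is measured to nearest cm") %>%
  # Range check
  col_vals_between(columns = vars(shts), left = 0, right = 20,
                   label = "shoots between 0 and 20") %>%
  # No missing IDs
  col_vals_not_null(columns = vars(id)) %>%
  # Duplicate IDs, checked separately for each year (one step per year)
  rows_distinct(columns = vars(id), segments = vars(year),
                label = "Check for duplicate IDs within each year") %>%
  interrogate()

# Failing row counts per step
x <- get_agent_x_list(agent)
print(data.frame(step = x$i, n_failed = x$n_failed))
```

Note that `col_is_integer()` tests the column's type as a single unit, whereas the integer checks in the table above report thousands of units, so they were presumably written as row-level checks instead (for example with `col_vals_expr()`).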
Checks that year to year change in size is reasonable
Pointblank Validation: Check growth & regression (tibble; WARN 1, STOP 0.02, NOTIFY —)

| | STEP | COLUMNS | VALUES | TBL | EVAL | UNITS | PASS (fraction) | FAIL (fraction) | W | S | N | EXT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | \|% change in height\| < 200% | | | | ✓ | 67K | 66K (1) | 422 (0) | ● | ○ | — | |
| 2 | \|∆ height\| < 100cm | | | | ✓ | 67K | 67K (1) | 11 (0) | — | ● | — | |
| 3 | \|∆ shoot number\| < 5 | | | | ✓ | 67K | 67K (1) | 201 (0) | — | ● | — | |

Started 2022-12-06 20:03:14 UTC; ran < 1 s; finished 2022-12-06 20:03:14 UTC
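One way to express year-to-year checks like these is to compute the change columns in a pointblank precondition and then validate them. This is a sketch under assumptions, not the report's actual code: the data frame `toy`, its columns (`id`, `year`, `ht`, `shts`), and the helper `add_change()` are hypothetical, and the thresholds are taken from the step labels above.

```r
library(pointblank)

# Toy data: two plants measured in three consecutive years (hypothetical)
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  year = c(1998, 1999, 2000, 1998, 1999, 2000),
  ht   = c(20, 25, 140, 50, 55, 60),
  shts = c(1, 2, 3, 2, 9, 3)
)

# Hypothetical helper used as a precondition: adds absolute year-to-year
# changes within each plant (NA for a plant's first year)
add_change <- function(tbl) {
  tbl %>%
    dplyr::group_by(id) %>%
    dplyr::arrange(year, .by_group = TRUE) %>%
    dplyr::mutate(
      abs_d_ht   = abs(ht - dplyr::lag(ht)),
      abs_pct_ht = 100 * abs_d_ht / dplyr::lag(ht),
      abs_d_shts = abs(shts - dplyr::lag(shts))
    ) %>%
    dplyr::ungroup()
}

al_default <- action_levels(warn_at = 1, stop_at = 0.02)
al_strict  <- action_levels(stop_at = 1)

agent <- create_agent(tbl = toy, label = "Check growth & regression",
                      actions = al_default) %>%
  col_vals_lt(columns = vars(abs_pct_ht), value = 200, na_pass = TRUE,
              preconditions = add_change,
              label = "|% change in height| < 200%") %>%
  col_vals_lt(columns = vars(abs_d_ht), value = 100, na_pass = TRUE,
              preconditions = add_change, actions = al_strict,
              label = "|change in height| < 100cm") %>%
  col_vals_lt(columns = vars(abs_d_shts), value = 5, na_pass = TRUE,
              preconditions = add_change, actions = al_strict,
              label = "|change in shoot number| < 5") %>%
  interrogate()

# Failing row counts per step
x <- get_agent_x_list(agent)
print(data.frame(step = x$i, n_failed = x$n_failed))
```

Setting `na_pass = TRUE` lets each plant's first year, which has no previous measurement to compare against, count as passing rather than failing.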
Check that size of seedlings is reasonable